EXPONENTIAL RANDOM SIMPLICIAL COMPLEXES


KONSTANTIN ZUEV, OR EISENBERG, AND DMITRI KRIOUKOV 


Abstract. Exponential random graph models have attracted significant research attention over the past
decades. These models are maximum-entropy ensembles subject to the constraints that the expected values
of a set of graph observables are equal to given values. Here we extend these maximum-entropy ensembles to
random simplicial complexes, which are more adequate and versatile constructions to model complex systems
in many applications. We show that many random simplicial complex models considered in the literature
can be cast as maximum-entropy ensembles under certain constraints. We introduce and analyze the
most general random simplicial complex ensemble $\mathbf{\Delta}$ with statistically independent simplices. Our analysis
is simplified by the observation that any distribution $P(O)$ on any collection of objects $\mathcal{O} = \{O\}$, including
graphs and simplicial complexes, is maximum-entropy subject to the constraint that the expected value
of $-\ln P(O)$ is equal to the entropy of the distribution. With the help of this observation, we prove that
ensemble $\mathbf{\Delta}$ is maximum-entropy subject to the two types of constraints which fix the expected numbers of
simplices and their boundaries.

Keywords. Random simplicial complexes, random graphs, maximum-entropy distributions, exponential
random graph models, network models.


1. Introduction 


When studying complex systems consisting of many interconnected, interacting components, it is rather
natural to represent the system as a graph or, more generally, as a simplicial complex. Modeling complex
systems with graphs has proved to be useful for understanding systems as intricate as the Internet, the
human brain, and interwoven social groups, and has led to a new area of research, called network science
[12, 14, 35].

A host of developed network models (e.g. see [21] for a survey) can be roughly divided into two classes:
“generative” models and “descriptive” models [1]. Generative models are algorithms which describe how
to generate a network using some probabilistic rules for connecting nodes. These models primarily aim to
uncover the hidden evolution mechanisms responsible for certain properties observed in real networks. A
classical, and perhaps the simplest and best studied, example of a generative model is the Erdős-Rényi
random graph $G(n,p)$ [18, 43]: given $n$ nodes, place a link between every two nodes independently
at random with probability $p$. Among other prominent examples are the preferential attachment model
[4, 13, 30] and the small-world model [36, 47, 48], which explain the power-law degree distributions and small
distances between most nodes, two universal properties observed in many real networks. Any generative
model gives rise to an ensemble $(\mathcal{G}, P)$, where $\mathcal{G}$ is the set of all graphs the model can possibly generate and
$P$ is the probability distribution on $\mathcal{G}$, where $P(G)$ is the probability that the model generates $G \in \mathcal{G}$.
One can always readily sample from $P$ (using the network generating algorithm), but often cannot obtain
a closed-form expression for $P(G)$, or even implicitly describe $P$ as a solution of some optimization problem
or equation.

Generative models can help to understand the fundamental organizing principles behind real networks 
and explain their qualitative behavior, but they are not specifically designed for network data analysis. 
Descriptive models attempt to fill this gap. A descriptive model is explicitly defined as an ensemble $(\mathcal{G}, P_\theta)$,
where $\mathcal{G}$ is a set of graphs and $P_\theta$ is the joint probability distribution on $\mathcal{G}$ parameterized by a vector of
parameters $\theta$, which are to be inferred from the observed network data. For any graph $G \in \mathcal{G}$, a descriptive
model gives a closed-form expression for $P_\theta(G)$ which can be used for further statistical inference, e.g. for
estimating ensemble averages $\sum_{G \in \mathcal{G}} x(G) P_\theta(G)$, where $x$ is a network property of interest. In contrast to
generative models, however, a descriptive model does not specify how to sample networks from $P_\theta$, which
is often a challenging task. In simple cases, a network model can be represented as both a generative and
a descriptive model. For example, the Erdős-Rényi random graph $G(n,p)$ can be defined, as above, by a
generative algorithm, or by the formula for the probability distribution $P(G) = p^{f_1(G)} (1-p)^{\binom{n}{2} - f_1(G)}$, where
$f_1(G)$ is the number of edges in $G$. In general, however, representing a generative model as descriptive (and
vice versa) is a very difficult problem whose solution could be very useful for applications.
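
The two views of $G(n,p)$ can be put side by side in a few lines of code. The following is a minimal sketch, not part of the original text: the function names `sample_gnp`, `gnp_probability`, and `all_graphs` are hypothetical, and graphs are represented simply as sets of node pairs.

```python
import itertools
import math
import random


def sample_gnp(n, p, rng):
    """Generative view: keep each of the C(n, 2) possible edges independently with probability p."""
    return {edge for edge in itertools.combinations(range(n), 2) if rng.random() < p}


def gnp_probability(edges, n, p):
    """Descriptive view: P(G) = p^{f_1(G)} * (1 - p)^{C(n, 2) - f_1(G)}."""
    f1 = len(edges)  # f_1(G), the number of edges in G
    return p ** f1 * (1 - p) ** (math.comb(n, 2) - f1)


def all_graphs(n):
    """Enumerate every simple graph on n labeled nodes as a set of edges (small n only)."""
    pairs = list(itertools.combinations(range(n), 2))
    for size in range(len(pairs) + 1):
        for subset in itertools.combinations(pairs, size):
            yield set(subset)
```

For instance, `sample_gnp(5, 0.3, random.Random(0))` draws one graph, and summing `gnp_probability` over `all_graphs(4)` should return a value numerically equal to one, confirming that the closed-form expression is a normalized distribution.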

Exponential random graphs (ERGs) [19, 23, 29, 37, 42, 45], often called $p^*$ models in the social network
research community [3, 40, 46], are among the most popular and best studied descriptive models which
provide a conceptual framework for statistical modeling of network data. Let $\mathcal{G}_n$ be the set of all simple
graphs (without self-loops or multi-edges) with $n$ nodes, let $x_1, \dots, x_r$ be functions on $\mathcal{G}_n$, henceforth referred to
as the graph observables, and let $\bar{x}_1, \dots, \bar{x}_r$ be the values of these observables $x_1(G), \dots, x_r(G)$ for a network
of interest $G \in \mathcal{G}_n$ computed from available network data. The ERG model defined by $G$ and its observables
$x_1, \dots, x_r$ is the exact analog of the Boltzmann distribution in statistical mechanics:

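\[ P_\theta(G) = \frac{e^{-H_\theta(G)}}{Z(\theta)}, \qquad H_\theta(G) = \sum_{i=1}^{r} \theta_i x_i(G), \tag{1.1} \]

where $H_\theta(G)$ is called the graph Hamiltonian, $Z(\theta)$ the partition function (the normalization constant), and
$\theta = (\theta_1, \dots, \theta_r)$ is a vector of model parameters which satisfy

\[ -\frac{\partial \ln Z}{\partial \theta_i} = \bar{x}_i. \tag{1.2} \]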


Whereas originally (1.1) was simply postulated and used in empirical studies [23], it was later recognized
[19, 37, 45] that ERGs are maximum-entropy ensembles. Namely, the distribution defined by (1.1) and
(1.2) maximizes the Gibbs entropy

\[ S(P) = -\sum_{G \in \mathcal{G}_n} P(G) \ln P(G), \tag{1.3} \]

subject to the $r$ “soft” constraints and the normalization condition

\[ \mathbb{E}_P[x_i] = \sum_{G \in \mathcal{G}_n} x_i(G) P(G) = \bar{x}_i, \qquad i = 1, \dots, r, \tag{1.4} \]

\[ \sum_{G \in \mathcal{G}_n} P(G) = 1. \tag{1.5} \]


The general principle of maximum entropy is thoroughly reviewed in [38]. In the context of complex networks,
the principle of maximum entropy and different entropy measures are discussed in [2]. Despite some known
problems with ERGs with nonlinearly correlated constraints [17, 24, 41], ERGs remain one of the most popular
descriptive models for network data analysis, especially in social science.

In many cases, however, representing a complex system with a simplicial complex — a higher-dimensional 
analog of a graph — is conceptually more sound than the basic network representation, and provides a 
“higher order approximation” of the system. Consider for example a social system of scientific collaboration. 
Three researchers may co-author a single article or they may have three different papers with two authors 
each. The network representation, where nodes are connected if the corresponding scientists co-authored a 
paper, will not distinguish between these two cases. But we can do this by placing (in the former case), 
or not (in the latter case), a 2-simplex on the three nodes. This is illustrated in Fig. 1. Other examples,
where the simplicial complex representation is more accurate, include biological protein-interaction systems,
where proteins form protein complexes often consisting of more than two proteins, economic systems of
financial transactions often involving several parties, and social systems, where groups of people are united
by a common motive, interest, or goal, as opposed to merely being pairwise connected.

FIGURE 1. Networks vs simplicial complexes. The outcome of collaboration between scientists
Z, E, and K could be three different papers co-authored by Z and E, E and K, and K and Z
(Panel (a)), or a single paper co-authored by all three scientists (Panel (b)). While the network
representation does not distinguish between these two cases and results in the graph in Panel (a),
the simplicial complex representation does by adding in the latter case the triangle {Z,E,K} in
Panel (b).

In general, compared to graphs, simplicial complexes encode more relevant information about a complex 
system, and make possible modeling beyond dyadic interactions. They have been used in many applications, 
including modeling social aggregation [28], agent interaction [44], opinion formation and dynamics [32, 33],
coverage and hole-detection in sensor networks [20], and broadcasting in wireless networks [39], to name just
a few. We remark that prior to their being used for studying complex interactions, simplicial complexes
were used in a rich variety of geometric problems, ranging from grid generation in finite element analysis to
modeling configuration spaces of dynamical systems [15]. Further details and applications can be found in [16].


In this paper, we introduce exponential random simplicial complexes (ERSCs), which are higher-dimensional
generalizations of exponential random graphs, develop the formalism for ERSCs, and show that several
popular generative models of random simplicial complexes (random flag complexes [25], Linial-Meshulam
complexes [31], and Kahle’s multi-parameter model [27]) can all be explicitly represented as ERSCs. We
also introduce the most general ensemble of random simplicial complexes $\mathbf{\Delta}$ with statistically independent
simplices, and show that this ensemble is an ERSC ensemble as well.

2. Basic Definitions and Notations 

Here we recall a few basic definitions and introduce notation that we use throughout the paper. For a
comprehensive reference on simplicial complexes the reader is referred to [34].


A simplicial complex $C$ on $n$ vertices $V = \{1, \dots, n\}$ is a collection of non-empty subsets of $V$, called
simplices. Complex $C$ contains all vertices, $\{i\} \in C$, and is closed under the subset relation: if $\sigma \in C$ and
$\tau \subset \sigma$, then $\tau \in C$, where $\tau$ is called a face of simplex $\sigma$, and $\sigma$ is a coface of $\tau$. A simplex $\sigma$ is called a
$k$-simplex of dimension $k$ if its cardinality is $|\sigma| = k + 1$. It is useful to think of a $k$-simplex as the convex
hull of $(k + 1)$ points in general position in $\mathbb{R}^{K}$ with $K \geq k$ [22]. For instance, 0-, 1-, 2-, and 3-simplices are,
respectively, vertices, edges, triangles, and tetrahedra. A simplicial complex is then a collection of simplices
of different dimensions properly glued together. We say that $C$ has dimension $m$ if it has at least one
$m$-simplex, but does not have simplices of higher dimension. Clearly, $m \leq n - 1$.
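
The definition above translates directly into a validity check. The sketch below is illustrative only: it assumes a complex is given as a list of vertex tuples on the vertex set $\{1, \dots, n\}$, and the helper names `is_simplicial_complex` and `dimension` are hypothetical.

```python
import itertools


def is_simplicial_complex(simplices, n):
    """Check the defining conditions on vertex set {1, ..., n}:
    all vertices are present, and every non-empty face of a simplex is present."""
    family = {frozenset(s) for s in simplices}
    vertices = set(range(1, n + 1))
    if any(frozenset([v]) not in family for v in vertices):
        return False  # some vertex {i} is missing
    for simplex in family:
        if not simplex or not simplex <= vertices:
            return False  # empty set or a vertex outside V
        for size in range(1, len(simplex)):
            for face in itertools.combinations(simplex, size):
                if frozenset(face) not in family:
                    return False  # not closed under the subset relation
    return True


def dimension(simplices):
    """Dimension m: the largest k for which a k-simplex (k + 1 vertices) is present."""
    return max(len(s) for s in simplices) - 1
```

A filled triangle on three vertices passes the check and has dimension 2, whereas the same list with one edge removed fails because the 2-simplex would then have a missing face.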
FIGURE 2. Simplicial complex and its adjacency tensors. In this example, $C \in \mathcal{C}_{10}$,
$\dim C = 3$, and the non-zero elements $a_{\mathbf{i}_d}$ of adjacency tensor $\mathbf{a}_d$, $d = 1, 2, 3, 4$, are: $a_{\mathbf{i}_1} = 1$ for all
$\mathbf{i}_1 = 1, \dots, 10$; $a_{\mathbf{i}_2} = 1$ for $\mathbf{i}_2$ = (1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7),
(6, 7), (7, 8), (8, 9), (8, 10), and (9, 10); $a_{\mathbf{i}_3} = 1$ for $\mathbf{i}_3$ = (2, 4, 5), (4, 5, 6), (4, 5, 7), (4, 6, 7), and (5, 6, 7);
$a_{\mathbf{i}_4} = 1$ only for $\mathbf{i}_4$ = (4, 5, 6, 7). The edge {4, 6} is not visible because of the 3-simplex {4, 5, 6, 7}.


Let $\mathcal{C}_n$ be the set of all simplicial complexes on $n$ vertices. By analogy with graphs, where there exists a
one-to-one correspondence between $\mathcal{G}_n$ and the set of all boolean symmetric $n$-by-$n$ matrices with zeros on the
diagonal, known as adjacency matrices, we can represent $\mathcal{C}_n$ by a tensor product

\[ \mathcal{C}_n = \bigotimes_{d=1}^{n} \mathbf{a}_d, \tag{2.1} \]

where $\mathbf{a}_d = \{a_{i_1, \dots, i_d}\}$, $i_j = 1, \dots, n$, $j = 1, \dots, d$, is a boolean symmetric tensor of order $d$ with zeros on
all its diagonals. These conditions require precisely that $a_{i_1, \dots, i_d} = a_{i_{\kappa(1)}, \dots, i_{\kappa(d)}}$ for any permutation $\kappa$ of
subindices $1, \dots, d$, and $a_{i_1, \dots, i_d} = 0$ if $i_j = i_k$ for any pair of distinct $j$ and $k$. The non-redundant elements of
tensor $\mathbf{a}_d$ are thus $a_{\mathbf{i}_d}$, where multi-index $\mathbf{i}_d$ denotes a $d$-tuple of indices with increasing values:

\[ \mathbf{i}_d = i_1 i_2 \dots i_d, \tag{2.2} \]

\[ 1 \leq i_1 < \dots < i_d \leq n. \tag{2.3} \]

The only requirement for $\bigotimes_{d=1}^{n} \mathbf{a}_d$ to be in bijection with $\mathcal{C}_n$ is then the following compatibility condition:

\[ a_{\mathbf{i}_d} = 1 \;\Rightarrow\; b_{\mathbf{i}_d} = \prod_{k=1}^{d} a_{\mathbf{i}_d^{\hat{k}}} = 1, \tag{2.4} \]

where

\[ \mathbf{i}_d^{\hat{k}} = i_1 \dots \hat{i}_k \dots i_d \tag{2.5} \]

is the $(d-1)$-long multi-index obtained from multi-index $\mathbf{i}_d$ by omitting index $i_k$. It is useful to think of
$\mathbf{i}_d^{\hat{k}}$ as the result of operation $(\cdot)^{\hat{k}}$, which is the deletion of the $k$th index, applied to multi-index $\mathbf{i}_d$. Condition
(2.4) simply formalizes the requirement that if the complex contains simplex $\{\mathbf{i}_d\}$, then it also contains all
its faces.

For a simplicial complex $C \in \mathcal{C}_n$, $\mathbf{a}_d = \{a_{\mathbf{i}_d}\}$ is thus its “adjacency” tensor that encodes the presence of
$(d-1)$-simplices: $a_{\mathbf{i}_d} = 1$ if $\{\mathbf{i}_d\} \in C$, and zero otherwise. Since we assume that $C$ has $n$ vertices, we trivially
have $\mathbf{a}_1 = \mathbf{1}_n = (1, \dots, 1)$. Figure 2 illustrates the correspondence between simplicial complexes and their
adjacency tensors.

FIGURE 3. Empty and filled skeletons. The left simplicial complex is the 1-skeleton $C^{(1)}$ of
the complex $C$ in Fig. 2. The filled 1-skeleton $C^{[2]}$ on the right is obtained by adding all 2-simplices
based on triangular subgraphs of $C^{(1)}$. The 3-simplex $\{4, 5, 6, 7\} \in C$ does not belong to $C^{[2]}$. The
filled skeleton is not a subcomplex of $C$ since, for example, $\{8, 9, 10\} \notin C$. The filling operation
is denoted by $\rightsquigarrow$.

A subcomplex of $C$ is a subset $C' \subseteq C$ that is also a simplicial complex. The $d$-skeleton of $C$, denoted
$C^{(d)}$, is a subcomplex consisting of all $k$-simplices of $C$ with $k \leq d$. The 1-skeleton of a simplicial complex,
for example, is a graph.

Definition 1. The filled $d$-skeleton, denoted $C^{[d+1]}$, is a simplicial complex

\[ C^{[d+1]} = C^{(d)} \sqcup \left\{ \{\mathbf{i}_{d+2}\} : b_{\mathbf{i}_{d+2}} = 1 \right\}. \tag{2.6} \]

In other words, $C^{[d+1]}$ is obtained from $C^{(d)}$ by adding $(d+1)$-simplices as follows. For every $(d+1)$-simplex
$\{\mathbf{i}_{d+2}\}$, if $C^{(d)}$ contains all $(d+2)$ $d$-simplices $\{\mathbf{i}_{d+2}^{\hat{k}}\}$, $k = 1, \dots, d+2$, we add $\{\mathbf{i}_{d+2}\}$ to $C^{(d)}$.
Intuitively, we add $\{\mathbf{i}_{d+2}\}$ if its $d$-dimensional boundary is already in $C^{(d)}$. Note that in this case we add
$\{\mathbf{i}_{d+2}\}$ even if $\{\mathbf{i}_{d+2}\} \notin C$, and, therefore, $C^{[d+1]}$ is not necessarily a subcomplex of $C$. For example, $C^{[1]}$
is a complete graph on $n$ vertices, and $C^{[2]}$ is the 1-skeleton of $C$ with all its triangular subgraphs filled by
2-simplices. We denote the filled $d$-skeleton by $C^{[d+1]}$ (instead of $C^{[d]}$), to emphasize that generally it has
dimension $(d+1)$. Figure 3 illustrates the construction of a filled skeleton.

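Thus, we have the following hierarchy of “empty” and “filled” skeletons:

\[ V = C^{(0)} \subset C^{(1)} \subset C^{(2)} \subset \dots \subset C^{(m-1)} \subset C^{(m)} = C, \qquad C^{(d-1)} \rightsquigarrow C^{[d]}, \quad d = 1, \dots, m, \]

where $\rightsquigarrow$ denotes the filling operation. Let $f_d$ denote the number of $d$-simplices in $C^{(d)}$ (and therefore in
$C$), and $\phi_d$ be the number of $d$-simplices in $C^{[d]}$. By construction, $\phi_d \geq f_d$, and

\[ f_d = \sum_{\mathbf{i}_{d+1}} a_{\mathbf{i}_{d+1}} \quad \text{and} \quad \phi_d = \sum_{\mathbf{i}_{d+1}} b_{\mathbf{i}_{d+1}}. \tag{2.7} \]

Figure 6 shows all simplicial complexes $C \in \mathcal{C}_3$ and the values of $f_1$, $f_2$, and $\phi_2$ for each $C$.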
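
The counts $f_d$ and $\phi_d$ are easy to compute by brute force on a small example. The sketch below uses the simplices listed in the caption of Figure 2; the set-of-sorted-tuples representation and the helper names `f`, `filled_simplices`, and `phi` are assumptions made for illustration.

```python
import itertools

# Simplices of dimension >= 1 as listed in the caption of Figure 2 (vertices 1..10 are all present).
EDGES = [(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (4, 6),
         (4, 7), (5, 6), (5, 7), (6, 7), (7, 8), (8, 9), (8, 10), (9, 10)]
TRIANGLES = [(2, 4, 5), (4, 5, 6), (4, 5, 7), (4, 6, 7), (5, 6, 7)]
TETRAHEDRA = [(4, 5, 6, 7)]
N = 10
COMPLEX = {(v,) for v in range(1, N + 1)} | set(EDGES) | set(TRIANGLES) | set(TETRAHEDRA)


def f(complex_, d):
    """f_d: number of d-simplices (sorted tuples of d + 1 vertices) in the complex."""
    return sum(1 for s in complex_ if len(s) == d + 1)


def filled_simplices(complex_, n, d):
    """d-simplices of the filled skeleton C^[d]: all (d + 1)-subsets whose entire
    (d - 1)-dimensional boundary is in the complex, i.e. those with b = 1."""
    return [s for s in itertools.combinations(range(1, n + 1), d + 1)
            if all(face in complex_ for face in itertools.combinations(s, d))]


def phi(complex_, n, d):
    """phi_d: number of d-simplices in the filled skeleton C^[d]."""
    return len(filled_simplices(complex_, n, d))
```

For this complex the sketch gives $f_1 = 16$, $f_2 = 5$, $f_3 = 1$ and $\phi_1 = 45$, $\phi_2 = 8$, $\phi_3 = 1$. The triangle (8, 9, 10) appears among the filled 2-simplices although it is not in the complex, in line with the caption of Figure 3.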
3. Exponential Random Simplicial Complexes

Let $\mathcal{S}$ be any subset of $\mathcal{C}_n$, $\{x_1, \dots, x_r\}$ be a set of functions on $\mathcal{S}$, $x_i : \mathcal{S} \to \mathbb{R}$, and $\{\bar{x}_1, \dots, \bar{x}_r\}$ be a set
of numbers, $\bar{x}_i \in \mathbb{R}$. We define the exponential random simplicial complex (ERSC) as a maximum-entropy
ensemble of complexes with “soft” constraints that require the observables $x_i$ to have the expected values
$\bar{x}_i$ in the ensemble.

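Definition 2. An exponential random simplicial complex $\mathrm{ERSC}(\mathcal{S}, \{x_i\}, \{\bar{x}_i\})$ is a pair $(\mathcal{S}, P)$, where $P$ is a
probability distribution on $\mathcal{S}$ that maximizes the entropy

\[ S(P) = -\sum_{C \in \mathcal{S}} P(C) \ln P(C) \to \max, \tag{3.1} \]

subject to the following constraints

\[ \mathbb{E}_P[x_i] = \sum_{C \in \mathcal{S}} x_i(C) P(C) = \bar{x}_i, \qquad i = 1, \dots, r, \tag{3.2} \]

\[ \sum_{C \in \mathcal{S}} P(C) = 1. \tag{3.3} \]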


An exponential random simplicial complex is thus a descriptive model for random simplicial complexes.
Generative models have been recently introduced and analyzed in [5, 6, 49].


We can define ERSC for any set of simplicial complexes, but, for most of the paper, we restrict ourselves
to $\mathcal{C}_n$ and its subsets. If we use $\mathcal{S} = \mathcal{G}_n \subset \mathcal{C}_n$, then we recover the definition of ERGs. As with ERGs,
the solution of the constrained optimization problem (3.1)-(3.3) belongs to the exponential family, hence the
name of the ensemble.


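Theorem 1. The maximum-entropy distribution $P$ defined by (3.1)-(3.3) can be written as follows

\[ P(C) = \frac{e^{-H(C)}}{Z(\theta)}, \qquad H(C) = \sum_{i=1}^{r} \theta_i x_i(C), \qquad Z(\theta) = \sum_{C \in \mathcal{S}} e^{-H(C)}, \tag{3.4} \]

where $H(C)$ is the Hamiltonian of simplicial complex $C \in \mathcal{S}$, $Z(\theta)$ is the normalizing constant, called the
partition function, and $\theta = (\theta_1, \dots, \theta_r)$ are the parameters satisfying the following system of $r$ equations

\[ -\frac{\partial \ln Z}{\partial \theta_i} = \bar{x}_i, \qquad i = 1, \dots, r. \tag{3.5} \]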

The proof is nearly identical to the proof for ERGs [37], but we give it here for completeness.

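Proof. We use the standard method of Lagrange multipliers to solve the optimization problem (3.1)-(3.3).
Let $\theta_1, \dots, \theta_r$ and $\alpha$ be the Lagrange multipliers for the constraints in (3.2) and (3.3). The Lagrangian is
then

\[ \mathcal{L} = -\sum_{C \in \mathcal{S}} P(C) \ln P(C) + \sum_{i=1}^{r} \theta_i \Big( \bar{x}_i - \sum_{C \in \mathcal{S}} x_i(C) P(C) \Big) + \alpha \Big( 1 - \sum_{C \in \mathcal{S}} P(C) \Big). \tag{3.6} \]

The maximum entropy is achieved if the distribution $P$ satisfies $\partial \mathcal{L} / \partial P(C) = 0$ for any $C \in \mathcal{S}$. This gives

\[ -\ln P(C) - 1 - \sum_{i=1}^{r} \theta_i x_i(C) - \alpha = 0, \tag{3.7} \]

or,

\[ P(C) \propto \exp\Big( -\sum_{i=1}^{r} \theta_i x_i(C) \Big), \tag{3.8} \]

which is equivalent to (3.4), since the normalization (3.3) fixes $e^{-1-\alpha} = 1/Z(\theta)$. It remains to check that (3.5) indeed holds:

\[ -\frac{\partial \ln Z}{\partial \theta_i} = -\frac{1}{Z} \frac{\partial}{\partial \theta_i} \sum_{C \in \mathcal{S}} e^{-H(C)} = \frac{1}{Z} \sum_{C \in \mathcal{S}} \frac{\partial H(C)}{\partial \theta_i} e^{-H(C)} = \frac{1}{Z} \sum_{C \in \mathcal{S}} x_i(C) e^{-H(C)} = \sum_{C \in \mathcal{S}} x_i(C) P(C) = \bar{x}_i, \tag{3.9} \]

since the expected value of the observable $x_i$ in the ensemble is $\bar{x}_i$. □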
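
The identity between $-\partial \ln Z / \partial \theta_i$ and the ensemble average of $x_i$ can be checked numerically on a small ensemble. The following sketch is an illustration under stated assumptions, not the authors' code: it takes $\mathcal{S}$ to be all graphs on 4 nodes, uses the numbers of edges and triangles as two example observables, picks an arbitrary parameter vector `THETA`, and approximates the derivative by a central finite difference. All names are hypothetical.

```python
import itertools
import math


def graphs(n):
    """All simple graphs on n labeled nodes, each as a frozenset of edges."""
    pairs = list(itertools.combinations(range(n), 2))
    for size in range(len(pairs) + 1):
        for subset in itertools.combinations(pairs, size):
            yield frozenset(subset)


def observables(edges, n):
    """Illustrative choice: x_1 = number of edges, x_2 = number of triangles."""
    triangles = sum(1 for a, b, c in itertools.combinations(range(n), 3)
                    if {(a, b), (a, c), (b, c)} <= edges)
    return (len(edges), triangles)


def log_partition(theta, xs):
    """ln Z(theta), with Z = sum_C exp(-H(C)) and H(C) = sum_i theta_i x_i(C)."""
    return math.log(sum(math.exp(-sum(t * x for t, x in zip(theta, obs))) for obs in xs))


def expected_values(theta, xs):
    """E_P[x_i] under P(C) = exp(-H(C)) / Z(theta), by direct summation."""
    log_z = log_partition(theta, xs)
    weights = [math.exp(-sum(t * x for t, x in zip(theta, obs)) - log_z) for obs in xs]
    return [sum(w * obs[i] for w, obs in zip(weights, xs)) for i in range(len(theta))]


def minus_dlogz(theta, xs, i, eps=1e-5):
    """Central finite-difference estimate of -d ln Z / d theta_i."""
    up, down = list(theta), list(theta)
    up[i] += eps
    down[i] -= eps
    return -(log_partition(up, xs) - log_partition(down, xs)) / (2 * eps)


XS = [observables(g, 4) for g in graphs(4)]   # observables of all 64 graphs on 4 nodes
THETA = (0.3, -0.2)                           # arbitrary parameters for the check
```

Comparing `expected_values(THETA, XS)[i]` with `minus_dlogz(THETA, XS, i)` for `i = 0, 1` should show agreement up to the finite-difference error.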


4. Simple Examples of ERSCs 

Here we illustrate ERSCs with three simple examples: Erdős-Rényi random graphs $G(n,p)$, random flag
complexes $X(n,p)$, and Linial-Meshulam random complexes $Y(n,p)$.

4.1. Erdős-Rényi Random Graphs. Perhaps the simplest nontrivial example of an ERSC is the Erdős-Rényi
random graph ensemble $G(n,p)$, which can be viewed as a generative model for 1-dimensional simplicial
complexes. $G(n,p)$ is a maximum-entropy ensemble with only one constraint that the expected number of
edges $f_1$ in the ensemble is $\binom{n}{2} p$ [37]:

\[ G(n,p) = \mathrm{ERSC}\left( \mathcal{G}_n, f_1, \binom{n}{2} p \right). \tag{4.1} \]


4.2. Random Flag Complexes. The flag complex $X(G)$ of a graph $G \in \mathcal{G}_n$, also called the clique complex
or the Vietoris-Rips complex, is a (deterministic) simplicial complex in $\mathcal{C}_n$ whose 1-skeleton is $G$ and whose
$k$-simplices correspond to complete subgraphs of $G$, called cliques, of size $k + 1$. Since any simplicial complex
is homeomorphic to a flag complex, flag complexes arise in different applications and are often used
for topological data analysis [50].

Kahle [25, 26] defines the random flag complex $X(n,p)$ as the flag complex of the Erdős-Rényi random
graph, $X(n,p) = X(G(n,p))$, and studies phase transitions of its homology groups. Here we show that
$X(n,p)$ is, in fact, an ERSC.
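
The two-step construction of $X(n,p)$ can be sketched as follows. This is a brute-force illustration suitable only for small $n$; the function names are hypothetical, and vertices are labeled $0, \dots, n-1$ for convenience.

```python
import itertools
import random


def flag_complex(n, edges):
    """Flag (clique) complex X(G): every clique of G on k + 1 vertices becomes a k-simplex.
    Brute force over all vertex subsets, so only suitable for small n."""
    edge_set = {tuple(sorted(e)) for e in edges}
    simplices = {(v,) for v in range(n)}
    for size in range(2, n + 1):
        for subset in itertools.combinations(range(n), size):
            if all(pair in edge_set for pair in itertools.combinations(subset, 2)):
                simplices.add(subset)
    return simplices


def sample_random_flag_complex(n, p, rng):
    """X(n, p): sample G ~ G(n, p), then return the deterministic flag complex X(G)."""
    edges = [e for e in itertools.combinations(range(n), 2) if rng.random() < p]
    return flag_complex(n, edges)
```

All randomness sits in the edge sampling; `flag_complex` itself is deterministic, which is the property the subsequent proposition relies on.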

Proposition 1. Let $\mathcal{F}_n \subset \mathcal{C}_n$ be the set of all flag complexes on $n$ vertices. Then

\[ X(n,p) = \mathrm{ERSC}\left( \mathcal{F}_n, f_1, \binom{n}{2} p \right). \tag{4.2} \]

Before giving the proof, we comment on what exactly Proposition 1 states. $X(n,p)$ is a generative
model of simplicial complexes: to generate $C \sim X(n,p)$, one first generates $G \sim G(n,p)$, and then sets
$C = X(G)$. Let $\mathcal{S}_{X(n,p)} \subseteq \mathcal{F}_n$ denote the sample space of this random generative process, and $P_{X(n,p)}$ be
the resulting probability distribution on $\mathcal{S}_{X(n,p)}$. The random flag complex $X(n,p)$ can therefore be viewed
as ensemble $(\mathcal{S}_{X(n,p)}, P_{X(n,p)})$. Proposition 1 claims that $(\mathcal{S}_{X(n,p)}, P_{X(n,p)})$ is a maximum-entropy ensemble
with $\mathcal{S}_{X(n,p)} = \mathcal{F}_n$ and a single constraint that the expected number of 1-simplices is $\binom{n}{2} p$. The proof is the
same as for (4.1), but we give it here for illustrative purposes.


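Proof. First, note that any flag complex $C \in \mathcal{F}_n$ can be generated by $X(n,p)$ with a non-zero probability:

\[ P_{X(n,p)}(C) = P_{G(n,p)}(C^{(1)}) = p^{f_1(C)} (1-p)^{\binom{n}{2} - f_1(C)}. \tag{4.3} \]

Therefore, $\mathcal{S}_{X(n,p)}$ is indeed equal to $\mathcal{F}_n$. To prove (4.2), we need to show that $P_{X(n,p)}$ is in fact the ERSC
probability distribution (3.4), (3.5). Since every flag complex $C \in \mathcal{F}_n$ is completely defined by the adjacency
matrix of its 1-skeleton, $\mathcal{F}_n = \bigotimes_{d=1}^{2} \mathbf{a}_d = \mathbf{1}_n \otimes \mathbf{a}_2$. The partition function $Z$ can then be computed as
follows:

\[ \begin{aligned} Z(\theta_1) &= \sum_{C \in \mathcal{F}_n} e^{-H(C)} = \sum_{C \in \mathcal{F}_n} e^{-\theta_1 f_1(C)} = \sum_{\mathbf{a}_2} e^{-\theta_1 \sum_{\mathbf{i}_2} a_{\mathbf{i}_2}} \\ &= \sum_{\mathbf{a}_2} \prod_{\mathbf{i}_2} e^{-\theta_1 a_{\mathbf{i}_2}} = \prod_{\mathbf{i}_2} \sum_{a_{\mathbf{i}_2} = 0}^{1} e^{-\theta_1 a_{\mathbf{i}_2}} = \left( 1 + e^{-\theta_1} \right)^{\binom{n}{2}}. \end{aligned} \tag{4.4} \]

We can now solve (3.5) with $\bar{x}_1 = \binom{n}{2} p$ for the parameter $\theta_1$,

\[ \theta_1 = -\ln \frac{p}{1-p}, \tag{4.5} \]

and check that indeed

\[ P_{X(n,p)}(C) = \frac{e^{-\theta_1 f_1(C)}}{Z(\theta_1)}. \tag{4.6} \]

This completes the proof. □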


Given (4.1), the result in (4.2) is intuitively expected since the random part of generating $C \sim X(n,p)$ is
sampling the 1-skeleton $C^{(1)} \sim G(n,p)$. The rest of the construction, $C = X(C^{(1)})$, is fully deterministic.

4.3. Linial-Meshulam Random Complexes. Another example of ERSC is a generative model $Y(n,p)$ for
random 2-complexes. To generate $Y \sim Y(n,p)$, we start with a complete graph on $n$ vertices, the 1-skeleton
of a future simplicial complex, and add each of the $\binom{n}{3}$ possible triangle faces independently at random with
probability $p$. Linial and Meshulam introduced this model in [31] and studied its topological properties.
In particular, they proved for $Y(n,p)$ a cohomological analog of the celebrated Erdős-Rényi theorem on
connectivity of the Erdős-Rényi random graphs [17]. The model $Y(n,p)$ can be readily generalized to higher
dimensions: start with a full $d$-complex on $n$ vertices, $1 \leq d \leq n - 2$, and add each of the $\binom{n}{d+2}$ possible
$(d+1)$-simplices independently at random with probability $p$. We denote this model by $Y_d(n,p)$. The original
Linial-Meshulam random complex $Y(n,p)$ is then $Y_1(n,p)$.
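
The generative procedure for $Y_d(n,p)$ described above can be sketched in a few lines. The function name `sample_linial_meshulam` and the set-of-tuples representation are illustrative assumptions.

```python
import itertools
import random


def sample_linial_meshulam(n, d, p, rng):
    """Y_d(n, p): full d-skeleton on n vertices, plus each of the C(n, d + 2)
    possible (d + 1)-simplices independently with probability p."""
    complex_ = set()
    for size in range(1, d + 2):                                # simplices of dimension 0..d
        complex_.update(itertools.combinations(range(n), size))
    for simplex in itertools.combinations(range(n), d + 2):     # candidate (d + 1)-simplices
        if rng.random() < p:
            complex_.add(simplex)
    return complex_
```

With `d = 1` this reduces to the original model $Y(n,p)$: a complete graph with randomly added triangles. No boundary check is needed because the full $d$-skeleton already contains every face of every candidate simplex.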

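Let $\mathcal{C}_n^{(d+1)} \subset \mathcal{C}_n$ be the set of all simplicial complexes of dimension $(d+1)$ or less, and $\mathcal{Y}_d \subset \mathcal{C}_n^{(d+1)}$ be the
subset of complexes with full $d$-skeleton. In other words,

\[ \mathcal{Y}_d = \left\{ C \in \mathcal{C}_n^{(d+1)} : C^{(k)} = C^{[k]}, \; k = 1, \dots, d \right\}. \tag{4.7} \]

Since for any $C \in \mathcal{Y}_d$, the first $(d+1)$ adjacency tensors $\mathbf{a}_1, \dots, \mathbf{a}_{d+1}$ are unit tensors with zero diagonals,
$\mathcal{Y}_d = \mathbf{a}_{d+2}$.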

Proposition 2. The Linial-Meshulam random complex $Y_d(n,p)$ is the ERSC ensemble:

\[ Y_d(n,p) = \mathrm{ERSC}\left( \mathcal{Y}_d, f_{d+1}, \binom{n}{d+2} p \right). \tag{4.8} \]

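Proof. The proof is similar to that for random flag complexes. Given $C \in \mathcal{Y}_d$, the probability that the
complex has been generated by $Y_d(n,p)$ is

\[ P_{Y_d(n,p)}(C) = p^{f_{d+1}(C)} (1-p)^{\binom{n}{d+2} - f_{d+1}(C)}. \tag{4.9} \]

We need to show that this is in fact the maximum-entropy distribution under the constraint $\mathbb{E}[f_{d+1}] = \binom{n}{d+2} p$.
The partition function is

\[ \begin{aligned} Z(\theta_1) &= \sum_{C \in \mathcal{Y}_d} e^{-H(C)} = \sum_{C \in \mathcal{Y}_d} e^{-\theta_1 f_{d+1}(C)} = \sum_{\mathbf{a}_{d+2}} e^{-\theta_1 \sum_{\mathbf{i}_{d+2}} a_{\mathbf{i}_{d+2}}} \\ &= \sum_{\mathbf{a}_{d+2}} \prod_{\mathbf{i}_{d+2}} e^{-\theta_1 a_{\mathbf{i}_{d+2}}} = \prod_{\mathbf{i}_{d+2}} \sum_{a_{\mathbf{i}_{d+2}} = 0}^{1} e^{-\theta_1 a_{\mathbf{i}_{d+2}}} = \left( 1 + e^{-\theta_1} \right)^{\binom{n}{d+2}}. \end{aligned} \tag{4.10} \]

The Lagrange multiplier is then $\theta_1 = -\ln \frac{p}{1-p}$, and

\[ P_{Y_d(n,p)}(C) = \frac{e^{-\theta_1 f_{d+1}(C)}}{Z(\theta_1)}, \tag{4.11} \]

as claimed. □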
This result is also expected, since the Linial-Meshulam random complex is a higher-dimensional analog
of the Erdős-Rényi random graph: sampling from $Y_d(n,p)$ is the same Bernoulli trials process as in $G(n,p)$,
with the only difference being that now we are creating $(d+1)$-simplices instead of 1-simplices (edges).


5. Any Distribution is Maximum-Entropy 

It is a well-known fact in statistics and information theory (e.g. [11]) that any discrete distribution $P^*$
is maximum-entropy under properly specified constraints. Specifically, if one can write $-\ln P^*$ as a linear
combination $\sum_i \lambda_i h_i + \xi$ of some functions $\{h_i\}$, then distribution $P^*$ uniquely maximizes entropy $S(P)$ across
all distributions $P$ that satisfy constraints $\mathbb{E}_P[h_i] = \mathbb{E}_{P^*}[h_i]$. In this section, we briefly review this general
result, and show how it applies to the already considered models $G(n,p)$, $X(n,p)$, and $Y_d(n,p)$, where $-\ln P^*$
can be written as a linear combination. We will see in the next section that this result dramatically simplifies
the proofs for more complicated ERSCs.

Let us consider a discrete probability space $(\Omega, P^*)$, where $\Omega$ is a finite sample space and $P^*$ is some fixed
probability distribution on $\Omega$. Let us represent the distribution $P^*$ in the “Gibbs form” as follows:

\[ P^*(\omega) = e^{-(-\ln P^*(\omega))} = e^{-H^*(\omega)}, \tag{5.1} \]

where

\[ H^*(\omega) = -\ln P^*(\omega). \tag{5.2} \]

Let $\bar{H}^*$ denote the expectation of the function $H^* : \Omega \to \mathbb{R}$ with respect to $P^*$, which is exactly the entropy
of $P^*$,

\[ \bar{H}^* = \mathbb{E}_{P^*}[H^*] = -\sum_{\omega \in \Omega} P^*(\omega) \ln P^*(\omega) = S(P^*). \tag{5.3} \]

Lemma 1. The probability distribution $P^*$ is the solution of the following optimization problem:

\[ S(P) = -\sum_{\omega \in \Omega} P(\omega) \ln P(\omega) \to \max, \tag{5.4} \]

subject to the constraints

\[ \sum_{\omega \in \Omega} P(\omega) = 1 \quad \text{and} \quad \mathbb{E}_P[H^*] = -\sum_{\omega \in \Omega} P(\omega) \ln P^*(\omega) = \bar{H}^*. \tag{5.5} \]

In other words, Lemma 1 states that any discrete probability distribution is a maximum-entropy
distribution. The entropy maximization is across all possible distributions $P$ satisfying the constraint that the
expected value of $H^*$ in distribution $P$ is equal to $H^*$’s expected value in distribution $P^*$, which is $P^*$’s
entropy. In what follows, we will need a more general version of Lemma 1.
Suppose that function $H^*$ in (5.2) can be written as a linear combination of $r$ other functions $h_i : \Omega \to \mathbb{R}$:

\[ H^*(\omega) = -\ln P^*(\omega) = \sum_{i=1}^{r} \lambda_i h_i(\omega) + \xi, \tag{5.6} \]

where $\lambda_i, \xi \in \mathbb{R}$ are constants. Let $\bar{h}_i$ denote the expectation of $h_i$ with respect to $P^*$,

\[ \bar{h}_i = \mathbb{E}_{P^*}[h_i] = \sum_{\omega \in \Omega} h_i(\omega) P^*(\omega). \tag{5.7} \]


Lemma 2. (Th. 11.1.1 [11]) The probability distribution $P^*$ is the solution of the following optimization
problem:

\[ S(P) = -\sum_{\omega \in \Omega} P(\omega) \ln P(\omega) \to \max, \tag{5.8} \]

subject to the constraints

\[ \sum_{\omega \in \Omega} P(\omega) = 1 \quad \text{and} \quad \mathbb{E}_P[h_i] = \sum_{\omega \in \Omega} h_i(\omega) P(\omega) = \bar{h}_i, \quad i = 1, \dots, r. \tag{5.9} \]

We note that the main utility of Lemma 2 is not in observing that $P^*(\omega) \propto e^{-\sum_{i=1}^{r} \lambda_i h_i(\omega)}$ is a maximum-entropy
distribution (in fact, Lemma 1 states that any distribution is), but in specifying more general
constraints (5.9) under which distribution $P^*$ is maximum-entropy. Lemma 1 is a special case of the more
general Lemma 2 with $\xi = 0$, $r = 1$, and $\lambda_1 = 1$. Indeed, in this case $H^* = h_1$, and the constraints in (5.5)
and (5.9) become manifestly identical. Lemma 2 is identical to Theorem 11.1.1 in [11], but we provide the
proof here for completeness.


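Proof. Let $P$ be any distribution that satisfies the constraints in (5.9). Then its entropy

\[ S(P) = -\sum_{\omega \in \Omega} P(\omega) \ln P(\omega) = -\sum_{\omega \in \Omega} P(\omega) \ln \left( \frac{P(\omega)}{P^*(\omega)} P^*(\omega) \right) = -D_{\mathrm{KL}}(P \,\|\, P^*) + \sum_{\omega \in \Omega} P(\omega) H^*(\omega), \tag{5.10} \]

where $D_{\mathrm{KL}}(P \,\|\, P^*)$ is the Kullback-Leibler (KL) divergence of $P$ from $P^*$. Since the KL divergence is always
non-negative,

\[ \begin{aligned} S(P) &\leq \sum_{\omega \in \Omega} P(\omega) H^*(\omega) = \sum_{\omega \in \Omega} P(\omega) \left( \sum_{i=1}^{r} \lambda_i h_i(\omega) + \xi \right) \\ &= \sum_{i=1}^{r} \lambda_i \mathbb{E}_P[h_i] + \xi = \sum_{i=1}^{r} \lambda_i \bar{h}_i + \xi = S(P^*). \end{aligned} \tag{5.11} \]

This shows that $P^*$ indeed maximizes the entropy. The uniqueness follows from the fact that $D_{\mathrm{KL}}(P \,\|\, P^*) = 0$
if and only if $P = P^*$. □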
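
The argument can be illustrated numerically on a three-point sample space. The sketch below is a toy check, not part of the original proof: it assumes $\Omega = \{0, 1, 2\}$, a single function $h(\omega) = \omega$, an arbitrary multiplier `LAMBDA`, and competitors of the form $P = P^* + t\,(1, -2, 1)$, which keep both the normalization and $\mathbb{E}[h]$ fixed. All names are hypothetical.

```python
import math

LAMBDA = 0.7
H_VALUES = [0, 1, 2]                              # h(omega) for omega in {0, 1, 2}
_weights = [math.exp(-LAMBDA * h) for h in H_VALUES]
P_STAR = [w / sum(_weights) for w in _weights]    # so that -ln P* = LAMBDA * h + xi


def entropy(p):
    """Gibbs/Shannon entropy S(P) = -sum P ln P."""
    return -sum(q * math.log(q) for q in p if q > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P || Q)."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def competitor(t):
    """P = P* + t * (1, -2, 1): same normalization and same E[h] as P*."""
    return [P_STAR[0] + t, P_STAR[1] - 2 * t, P_STAR[2] + t]


def decomposition_gap(p):
    """S(P) - ( -D_KL(P || P*) + E_P[H*] ), which the decomposition says is zero."""
    expected_h_star = sum(a * -math.log(b) for a, b in zip(p, P_STAR))
    return entropy(p) - (-kl_divergence(p, P_STAR) + expected_h_star)
```

For small nonzero `t` (for example `t = 0.05`), `entropy(competitor(t))` should be strictly below `entropy(P_STAR)`, while `decomposition_gap(competitor(t))` stays at the level of rounding error.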

Lemmas 1 and 2 can be formulated for any ensemble of discrete “objects,” including sets of graphs and
simplicial complexes. Using the notation introduced in Definition 2 and applying the Lemmas to $\Omega = \mathcal{S} \subseteq \mathcal{C}_n$,
we can concisely write Lemma 1 as

\[ (\mathcal{S}, P^*) = \mathrm{ERSC}\left( \mathcal{S}, H^*, \bar{H}^* \right), \tag{5.12} \]

and Lemma 2 as

\[ (\mathcal{S}, P^*) = \mathrm{ERSC}\left( \mathcal{S}, \{h_i\}, \{\bar{h}_i\} \right). \tag{5.13} \]

In many cases, Lemmas 1 and 2 are not useful, since for many generative models $(\mathcal{S}, P^*)$ the distribution
$P^*$ cannot be explicitly written in the Gibbs form with linear Hamiltonian (5.6). Moreover, in generative
models, the complexity of the generating algorithm often makes it impossible to even explicitly compute the
probability $P^*(C)$ for a given $C$. Even in the preferential attachment model [2], where the algorithm which
generates a network appears to be fairly simple (a new node connects to existing node $i$ with probability
proportional to its degree, $p_i \propto k_i$), the resulting distribution is unknown. However, if we do know $P^*$ as
a function of observables, $P^*(C) \propto e^{-\sum_{i=1}^{r} \lambda_i h_i(C)}$, Lemma 2 is very helpful in representing $(\mathcal{S}, P^*)$ as an
ERSC.

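Indeed, let us briefly see how Lemma 2 applies to the already considered generative models. For $G(n,p)$,
the probability of a graph $G \in \mathcal{G}_n$ in the model is

\[ P_{G(n,p)}(G) = p^{f_1(G)} (1-p)^{\binom{n}{2} - f_1(G)}. \tag{5.14} \]

The corresponding Hamiltonian is then

\[ \begin{aligned} H_{G(n,p)}(G) &= -f_1(G) \ln p - \left( \binom{n}{2} - f_1(G) \right) \ln(1-p) \\ &= \underbrace{\ln \frac{1-p}{p}}_{\lambda_1} \, \underbrace{f_1(G)}_{h_1(G)} \; \underbrace{- \binom{n}{2} \ln(1-p)}_{\xi}, \end{aligned} \tag{5.15} \]

where the bottom notations refer to the notations in Lemma 2. The observation that $G(n,p)$ is an ERG
(4.1) then follows from Lemma 2 since $\mathbb{E}_{P_{G(n,p)}}[f_1] = \binom{n}{2} p$. Similarly, for $X(n,p)$ and $Y_d(n,p)$,

\[ \begin{aligned} H_{X(n,p)}(C) &= \ln \frac{1-p}{p} \, f_1(C) - \binom{n}{2} \ln(1-p), & \bar{f}_1 &= \mathbb{E}_{X(n,p)}[f_1] = \binom{n}{2} p, \\ H_{Y_d(n,p)}(C) &= \ln \frac{1-p}{p} \, f_{d+1}(C) - \binom{n}{d+2} \ln(1-p), & \bar{f}_{d+1} &= \mathbb{E}_{Y_d(n,p)}[f_{d+1}] = \binom{n}{d+2} p, \end{aligned} \tag{5.16} \]

and the observations (4.2) and (4.8) that these ensembles are ERSCs are direct corollaries of Lemma 2.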

The main point of this section is that in case the probability distribution is a known exponential function 
of a linear combination of structural observables, the computation of the partition function, which tends to 
be a nontrivial task in general, is not necessary to show that the distribution is the unique maximizer of 
entropy across all the distributions that satisfy the constraints that the expected values of these observables 
are equal to their expected values in this distribution. 


6. Kahle’s Δ-Ensembles


We now turn to a more general model which contains the Erdős-Rényi random graphs, the random flag
complexes, and the Linial-Meshulam complexes as special cases. In a recent survey [27], Kahle introduced the
following multi-parameter model $\Delta(n; p_1, \dots, p_{n-1})$ which generates random simplicial complexes inductively
by dimension. First, build a 1-skeleton by putting an edge between any two vertices with probability $p_1$.
Then, for $d = 2, \dots, n-1$, add every $d$-simplex with probability $p_d$, but only if the entire $(d-1)$-dimensional
boundary of that simplex is already in place. More formally, we have the following definition.
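Definition 3. The Kahle model $\Delta(n; p_1, \dots, p_{n-1})$ is a random simplicial complex model that generates
a complex as follows: for $d = 1, \dots, n-1$, for every $\mathbf{i}_{d+1}$,

\[ \begin{aligned} &\text{if } b_{\mathbf{i}_{d+1}} = 0 \;\Rightarrow\; \text{set } a_{\mathbf{i}_{d+1}} = 0, \\ &\text{if } b_{\mathbf{i}_{d+1}} = 1 \;\Rightarrow\; \text{set } a_{\mathbf{i}_{d+1}} = \begin{cases} 1 & \text{with probability } p_d, \\ 0 & \text{with probability } 1 - p_d. \end{cases} \end{aligned} \tag{6.1} \]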
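
The inductive rule in Definition 3 can be sketched as a sampler. This is a minimal illustration, assuming vertices labeled $0, \dots, n-1$ and a list `probs` in which `probs[d - 1]` plays the role of $p_d$; the function name `sample_kahle` is hypothetical.

```python
import itertools
import random


def sample_kahle(n, probs, rng):
    """Sample a complex from Delta(n; p_1, ..., p_{n-1}); probs[d - 1] plays the role of p_d."""
    complex_ = {(v,) for v in range(n)}                    # all n vertices are always present
    for d in range(1, n):                                  # dimensions 1, ..., n - 1
        for simplex in itertools.combinations(range(n), d + 1):
            # b = 1 exactly when every (d - 1)-dimensional face is already in the complex.
            boundary_present = all(face in complex_
                                   for face in itertools.combinations(simplex, d))
            if boundary_present and rng.random() < probs[d - 1]:
                complex_.add(simplex)                      # a = 1 with probability p_d
    return complex_
```

Setting `probs = [p, 1, ..., 1]` fills every clique of the random graph, and `probs = [1, p, 0, ..., 0]` keeps the complete graph and adds triangles at random, which are the two simplest special cases of the model.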


Topological properties of the Kahle model are studied in [8-10]. Here we study its maximum-entropy
properties. In Appendix A.1 we prove the following proposition.

Proposition 3. Let $C \sim \Delta(n; p_1, \dots, p_{n-1})$. The expected numbers of $d$-simplices in $C^{(d)}$ and $C^{[d]}$ are

\[ \bar{f}_d = \binom{n}{d+1} \prod_{k=1}^{d} p_k^{\binom{d+1}{k+1}}, \qquad \bar{\phi}_d = \binom{n}{d+1} \prod_{k=1}^{d-1} p_k^{\binom{d+1}{k+1}}. \tag{6.2} \]

The Kahle model unifies all the random simplicial complexes we have considered so far:

\[ \begin{aligned} G(n,p) &= \Delta(n; p, 0, \dots, 0), \\ X(n,p) &= \Delta(n; p, 1, \dots, 1), \\ Y(n,p) &= \Delta(n; 1, p, 0, \dots, 0), \\ Y_d(n,p) &= \Delta(n; \underbrace{1, \dots, 1}_{d}, p, 0, \dots, 0). \end{aligned} \tag{6.3} \]

Since all these special cases are ERSCs, it is natural to expect that so is $\Delta(n; p_1, \dots, p_{n-1})$. We cannot
prove this using the same method as for the Erdős-Rényi random graphs and the random flag and
Linial-Meshulam complexes in Section 4. As with ERGs, analytical computation of the partition function $Z(\theta)$
for ERSCs is rarely possible, and $G(n,p)$, $X(n,p)$, and $Y_d(n,p)$ are lucky exceptions. In Appendix A.3 we
illustrate difficulties one has to be prepared to experience when attempting to compute the partition function
for $\Delta(n; p_1, \dots, p_{n-1})$ with $n = 3$. However, with the help of Lemmas 1 and 2 in Section 5, there exists a simpler
alternative proof. The fact that $\Delta(n; p_1, \dots, p_{n-1})$ is an ERSC is a direct corollary of those lemmas.

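Theorem 2. The Kahle Δ-ensemble is the ERSC ensemble:

\[ \Delta(n; p_1, \dots, p_{n-1}) = \mathrm{ERSC}\left( \mathcal{C}_n, \left\{ \{f_d\}_{d=1}^{n-1}, \{\phi_d\}_{d=2}^{n-1} \right\}, \left\{ \{\bar{f}_d\}_{d=1}^{n-1}, \{\bar{\phi}_d\}_{d=2}^{n-1} \right\} \right), \tag{6.4} \]

where $\bar{f}_d$ and $\bar{\phi}_d$ are the expected numbers of $d$-simplices in $C^{(d)}$ and $C^{[d]}$.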

Proof. For any $C \in \mathcal{C}_n$, the probability $P_\Delta(C)$ that $\Delta(n; p_1, \dots, p_{n-1})$ generates $C$ can be computed by induction:

$$
P_\Delta(C) = \prod_{d=1}^{n-1} P_\Delta\left(C^{(d)} \mid C^{(d-1)}\right) = \prod_{d=1}^{n-1} p_d^{f_d(C)} (1 - p_d)^{\phi_d(C) - f_d(C)}. \tag{6.5}
$$

Indeed, given the $(d-1)$-skeleton $C^{(d-1)}$, the maximum possible number of $d$-simplices in $C^{(d)}$ is exactly $\phi_d(C)$, the number of $d$-simplices in the filled skeleton $\hat{C}^{(d)}$. Since each of these $d$-simplices appears independently with probability $p_d$, the conditional probability is $P_\Delta\left(C^{(d)} \mid C^{(d-1)}\right) = p_d^{f_d(C)} (1 - p_d)^{\phi_d(C) - f_d(C)}$, where $f_d(C)$ is the actual number of $d$-simplices in $C^{(d)}$.

The Hamiltonian of $C$ is therefore

$$
\begin{aligned}
H_\Delta(C) &= \sum_{d=1}^{n-1} \left[ f_d(C) \ln\frac{1 - p_d}{p_d} + \phi_d(C) \ln\frac{1}{1 - p_d} \right] \\
&= \sum_{d=1}^{n-1} f_d(C) \ln\frac{1 - p_d}{p_d} + \sum_{d=2}^{n-1} \phi_d(C) \ln\frac{1}{1 - p_d} + \binom{n}{2} \ln\frac{1}{1 - p_1},
\end{aligned}
\tag{6.6}
$$

since $\phi_1(C) = \binom{n}{2}$ for any $C$. Using Lemma 2 with

$$
\begin{aligned}
\{h_i\} &= \{f_1, \dots, f_{n-1}, \phi_2, \dots, \phi_{n-1}\}, \\
\{\lambda_i\} &= \left\{\ln\frac{1 - p_1}{p_1}, \dots, \ln\frac{1 - p_{n-1}}{p_{n-1}}, \ln\frac{1}{1 - p_2}, \dots, \ln\frac{1}{1 - p_{n-1}}\right\}, \\
\text{additive constant} &= \binom{n}{2} \ln\frac{1}{1 - p_1},
\end{aligned}
\tag{6.7}
$$

completes the proof. □

FIGURE 4. Sampling from $\boldsymbol{\Delta}$ with $n = 10$. Panel (a) shows the starting point: $n = 10$ potential 0-simplices (vertices) represented by empty dots. At the first step, shown in Panel (b), we create the 0-skeleton of a future complex $C$ by including (excluding) every vertex $i$ in (from) $C^{(0)}$ with probability $p_i$ (with probability $1 - p_i$). In this example, five vertices represented by filled dots belong to $C^{(0)}$. Next, in Panel (c), we generate the 1-skeleton by creating 1-simplices (edges). Each of the $\binom{5}{2}$ possible edges $\{\mathbf{i}_2\} = \{1,4\}, \{1,5\}, \dots, \{7,8\}$ appears in $C^{(1)}$ with probability $p_{\mathbf{i}_2}$. We represent the accepted (rejected) edges by solid (dashed) lines. Finally, in Panel (d), we create the 2-skeleton by adding 2-simplices (triangles). There are only two possible triangles in $C^{(2)}$: $\{1,4,5\}$ and $\{5,7,8\}$. Here, the former (empty) was rejected with probability $1 - p_{145}$, and the latter (filled) was accepted with probability $p_{578}$.
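As a sanity check of (6.5) and (6.6), the smallest nontrivial case $n = 3$ can be enumerated exhaustively. The sketch below is an illustration under the assumption that a complex on three labelled vertices is fully described, for this purpose, by the triple $(f_1, f_2, \phi_2)$; the function names are hypothetical.

```python
import math
from itertools import combinations


def complexes_on_3_vertices():
    """Yield (f1, f2, phi2) for each of the 9 complexes on 3 labelled vertices."""
    edges = list(combinations(range(3), 2))
    for r in range(4):
        for chosen in combinations(edges, r):
            f1 = len(chosen)
            phi2 = 1 if f1 == 3 else 0  # is the triangle's boundary present?
            yield f1, 0, phi2
            if phi2:
                yield f1, 1, phi2  # same 1-skeleton with the triangle filled


def prob(f1, f2, phi2, p1, p2):
    """P(C) from (6.5) with n = 3, where phi_1(C) = 3."""
    return p1**f1 * (1 - p1) ** (3 - f1) * p2**f2 * (1 - p2) ** (phi2 - f2)


def hamiltonian(f1, f2, phi2, p1, p2):
    """H(C) from (6.6) with n = 3; it should equal -ln P(C)."""
    return (
        f1 * math.log((1 - p1) / p1)
        + f2 * math.log((1 - p2) / p2)
        + phi2 * math.log(1 / (1 - p2))
        + math.comb(3, 2) * math.log(1 / (1 - p1))
    )
```

Summing `prob` over the nine complexes gives 1, and `math.exp(-hamiltonian(...))` reproduces `prob(...)` term by term, which is the statement that the additive constant in (6.6) already accounts for normalization.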


7. General Random Simplicial Complexes with Independent Simplices 

Finally, we introduce and consider the most general case of random simplicial complexes with statistically independent simplices. In this case each simplex has its own individual probability of appearance. To stay as general as possible, we must allow even the 0-simplices (vertices) to be present with arbitrary probabilities, which are not necessarily equal to 1. We denote this new model by $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$, or $\boldsymbol{\Delta}$ for brevity, where $\mathbf{p}_d = \{p_{\mathbf{i}_d}\}$ is a collection of $\binom{n}{d}$ appearance probabilities, one for each $(d-1)$-simplex. Whereas in $\Delta$ in the previous section the subindex $d$ in $p_d$ refers to the simplex dimension, in $\boldsymbol{\Delta}$ the sub-multi-index $\mathbf{i}_d$ in $p_{\mathbf{i}_d}$ refers to the specific $(d-1)$-simplex $\{\mathbf{i}_d\}$. To generate $C \sim \boldsymbol{\Delta}$, we first create its 0-skeleton by having vertices $\{i_1\} \in C$ with probabilities $p_{i_1}$, $i_1 = 1, \dots, n$. Then, for $d = 1, \dots, n-1$, we add every $d$-simplex $\{\mathbf{i}_{d+1}\}$ with probability $p_{\mathbf{i}_{d+1}}$, but only if the entire $(d-1)$-dimensional boundary of that simplex is already in place. Figure 4 illustrates the generation of a 2-complex from $\boldsymbol{\Delta}$ with $n = 10$.

More formally, we have the following definition. Let $\mathcal{C}_{\le n} = \bigcup_{k=1}^{n} \mathcal{C}_k$ denote the set of all simplicial complexes with $n$ vertices or fewer. As in Section 2, any $C \in \mathcal{C}_{\le n}$ is uniquely determined by a collection of its adjacency tensors $\mathbf{a}_d = \{a_{\mathbf{i}_d}\}$, $d = 1, \dots, n$, except that now $\mathbf{a}_1$ is not necessarily equal to the all-ones vector $\mathbf{1}_n$, since $C$ may have fewer than $n$ vertices.

Definition 4. The model $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$ is a random simplicial complex model that generates $C \in \mathcal{C}_{\le n}$ as follows: for $d = 0, \dots, n-1$, for every $\mathbf{i}_{d+1}$,

$$
\begin{aligned}
&\text{if } b_{\mathbf{i}_{d+1}} = 0 \;\Rightarrow\; \text{set } a_{\mathbf{i}_{d+1}} = 0, \\
&\text{if } b_{\mathbf{i}_{d+1}} = 1 \;\Rightarrow\; \text{set } a_{\mathbf{i}_{d+1}} =
\begin{cases}
1 & \text{with probability } p_{\mathbf{i}_{d+1}}, \\
0 & \text{with probability } 1 - p_{\mathbf{i}_{d+1}}.
\end{cases}
\end{aligned}
\tag{7.1}
$$

Here, for convenience, we use the convention that $b_{i_1} = 1$ for all $i_1 = 1, \dots, n$. If $p_{\mathbf{i}_{d+1}} = p_d$ for $d = 0, \dots, n-1$ and $p_0 = 1$, then we recover the original Kahle model $\Delta(n; p_1, \dots, p_{n-1})$ from the previous section.
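The constructive rule of Definition 4 translates directly into a sampler. The following is a minimal sketch, not an implementation from the paper: `sample_general` is a hypothetical name, simplices are represented as sorted tuples of 0-based vertex labels, and `prob` is an assumed user-supplied function returning the appearance probability of a given simplex.

```python
import itertools
import random


def sample_general(n, prob, rng=None):
    """Sample a complex in which every simplex has its own probability.

    prob(simplex) returns the probability of a simplex given as a sorted
    tuple of vertices; the result is a set of such tuples.
    """
    rng = rng or random.Random()
    simplices = set()
    for size in range(1, n + 1):  # size = d + 1 vertices, d = 0..n-1
        for cand in itertools.combinations(range(n), size):
            # Convention: the boundary of a vertex is always "present".
            boundary_present = size == 1 or all(
                face in simplices
                for face in itertools.combinations(cand, size - 1)
            )
            if boundary_present and rng.random() < prob(cand):
                simplices.add(cand)
    return simplices
```

Choosing `prob` to return 1 for every vertex and a dimension-dependent value otherwise reduces this sketch to the Kahle model of the previous section, in line with the remark that closes Definition 4.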

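To give the expressions for the expected values of the observables $a_{\mathbf{i}_d}$ and $b_{\mathbf{i}_d}$ in $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$, we need a bit of new notation. Let the multi-index $\mathbf{k}_m = k_1, \dots, k_m$ denote an $m$-tuple with increasing values $1 \le k_1 < \dots < k_m \le d$, and let $\mathbf{i}_d^{\mathbf{k}_m} = i_1, \dots, \hat{i}_{k_1}, \dots, \hat{i}_{k_m}, \dots, i_d$ be the $(d-m)$-long multi-index with $i_{k_1}, \dots, i_{k_m}$ omitted, with the convention that $\mathbf{k}_0 = \emptyset$ and $\mathbf{i}_d^{\mathbf{k}_0} = \mathbf{i}_d$. In Appendix A.2 we prove the following proposition.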


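Proposition 4. Let $C \sim \boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$. The expected values of the observables $a_{\mathbf{i}_d}$ and $b_{\mathbf{i}_d}$ are

$$
\bar{a}_{\mathbf{i}_d} = \prod_{m=0}^{d-1} \prod_{\mathbf{k}_m} p_{\mathbf{i}_d^{\mathbf{k}_m}} \quad \text{and} \quad \bar{b}_{\mathbf{i}_d} = \prod_{m=1}^{d-1} \prod_{\mathbf{k}_m} p_{\mathbf{i}_d^{\mathbf{k}_m}}. \tag{7.2}
$$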
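In words, Proposition 4 says that $\bar{b}_{\mathbf{i}_d}$ is the product of the appearance probabilities of all proper nonempty faces of the simplex, and $\bar{a}_{\mathbf{i}_d}$ additionally includes the probability of the simplex itself. A small sketch under the same assumptions as before (simplices as sorted tuples, a hypothetical `prob` function, and an illustrative helper name):

```python
from itertools import combinations
from math import isclose, prod


def expected_a_b(simplex, prob):
    """Return (a_bar, b_bar) for a simplex given as a sorted tuple of vertices."""
    d = len(simplex)
    # m = 1..d-1 omitted vertices: all proper nonempty faces of the simplex.
    b_bar = prod(
        prob(face)
        for size in range(1, d)
        for face in combinations(simplex, size)
    )
    # The m = 0 term is the simplex itself.
    a_bar = prob(simplex) * b_bar
    return a_bar, b_bar
```

For a triangle $\{1,2,3\}$, for instance, this evaluates $\bar{b}_{123} = p_1 p_2 p_3\, p_{12} p_{13} p_{23}$ and $\bar{a}_{123} = p_{123}\, \bar{b}_{123}$; for a single vertex the empty product gives $\bar{b}_{i_1} = 1$, matching the convention in Definition 4.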

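Lemma 2 helps again to prove that the general model $\boldsymbol{\Delta}$ is also an ERSC.

Theorem 3. The $\boldsymbol{\Delta}$-ensemble is the ERSC ensemble:

$$
\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n) = \mathrm{ERSC}\left(\mathcal{C}_{\le n}, \left\{\{a_{\mathbf{i}_d}\}_{d=1}^{n}, \{b_{\mathbf{i}_d}\}_{d=2}^{n}\right\}, \left\{\{\bar{a}_{\mathbf{i}_d}\}_{d=1}^{n}, \{\bar{b}_{\mathbf{i}_d}\}_{d=2}^{n}\right\}\right), \tag{7.3}
$$

where $\bar{a}_{\mathbf{i}_d}$ and $\bar{b}_{\mathbf{i}_d}$ are the expected values of the observables $a_{\mathbf{i}_d}$ and $b_{\mathbf{i}_d}$.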

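Proof. The probability that $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$ generates $C \in \mathcal{C}_{\le n}$ is

$$
\begin{aligned}
P_{\boldsymbol{\Delta}}(C) &= P_{\boldsymbol{\Delta}}\left(C^{(0)}\right) \prod_{d=1}^{n-1} P_{\boldsymbol{\Delta}}\left(C^{(d)} \mid C^{(d-1)}\right) \\
&= \prod_{i_1} p_{i_1}^{a_{i_1}} (1 - p_{i_1})^{1 - a_{i_1}} \prod_{d=1}^{n-1} \prod_{\mathbf{i}_{d+1}} p_{\mathbf{i}_{d+1}}^{a_{\mathbf{i}_{d+1}}} (1 - p_{\mathbf{i}_{d+1}})^{b_{\mathbf{i}_{d+1}} - a_{\mathbf{i}_{d+1}}} \\
&= \prod_{d=1}^{n} \prod_{\mathbf{i}_d} p_{\mathbf{i}_d}^{a_{\mathbf{i}_d}} (1 - p_{\mathbf{i}_d})^{b_{\mathbf{i}_d} - a_{\mathbf{i}_d}}.
\end{aligned}
\tag{7.4}
$$

The Hamiltonian of $C$ is then

$$
\begin{aligned}
H_{\boldsymbol{\Delta}}(C) &= \sum_{d=1}^{n} \sum_{\mathbf{i}_d} \left[ a_{\mathbf{i}_d} \ln\frac{1 - p_{\mathbf{i}_d}}{p_{\mathbf{i}_d}} + b_{\mathbf{i}_d} \ln\frac{1}{1 - p_{\mathbf{i}_d}} \right] \\
&= \sum_{d=1}^{n} \sum_{\mathbf{i}_d} \alpha_{\mathbf{i}_d} a_{\mathbf{i}_d} + \sum_{d=2}^{n} \sum_{\mathbf{i}_d} \beta_{\mathbf{i}_d} b_{\mathbf{i}_d} + \sum_{i_1} \ln\frac{1}{1 - p_{i_1}},
\end{aligned}
\tag{7.5}
$$

where $\alpha_{\mathbf{i}_d}$ and $\beta_{\mathbf{i}_d}$ are the Lagrange multipliers coupled to observables $a_{\mathbf{i}_d}$ and $b_{\mathbf{i}_d}$,

$$
\alpha_{\mathbf{i}_d} = \ln\frac{1 - p_{\mathbf{i}_d}}{p_{\mathbf{i}_d}} \quad \text{and} \quad \beta_{\mathbf{i}_d} = \ln\frac{1}{1 - p_{\mathbf{i}_d}}. \tag{7.6}
$$

Using Lemma 2 with

$$
\begin{aligned}
\{h_i\} &= \left\{\{a_{i_1}\}, \dots, \{a_{\mathbf{i}_n}\}, \{b_{\mathbf{i}_2}\}, \dots, \{b_{\mathbf{i}_n}\}\right\}, \\
\{\lambda_i\} &= \left\{\{\alpha_{i_1}\}, \dots, \{\alpha_{\mathbf{i}_n}\}, \{\beta_{\mathbf{i}_2}\}, \dots, \{\beta_{\mathbf{i}_n}\}\right\}, \\
\text{additive constant} &= \sum_{i_1} \ln\frac{1}{1 - p_{i_1}},
\end{aligned}
\tag{7.7}
$$

completes the proof. □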


8. Discussion 


In summary, exponential random simplicial complexes (ERSCs) are a natural higher-dimensional analog of exponential random graphs, which are extensively used for modeling network data and statistical inference. An ERSC ensemble is a maximum-entropy ensemble of simplicial complexes under “soft” constraints that fix expected values of some observables or properties of simplicial complexes. We have developed the formalism for ERSCs, and introduced the most general generative model of random simplicial complexes $\boldsymbol{\Delta}$ with statistically independent simplices. This model has as special cases several popular models studied in the literature: Erdős-Rényi random graphs, random flag complexes, Linial-Meshulam complexes, and Kahle’s $\Delta$-ensembles. Like all these models, $\boldsymbol{\Delta}$ is an ERSC ensemble. The constraints in this ensemble fix the expected numbers of simplices and of their boundaries.

This result is a direct corollary of the general observation that any probability distribution $P$ is maximum-entropy under the constraint that the expected value of $-\ln P$ is equal to the entropy of $P$. This observation dramatically simplifies the representation of many ensembles of random simplicial complexes as ERSCs, since the calculation of the partition function is no longer needed. For example, to show that the Erdős-Rényi random graphs $G(n,p)$ are exponential random graphs with a given expected number of edges, one does not really have to calculate the partition function. This calculation is trivial in the Erdős-Rényi case or in the general case of exponential random graphs with statistically independent edges [37]. However, the analogous calculation for the general case of random simplicial complexes $\boldsymbol{\Delta}$ with statistically independent simplices appears to be intractable.
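One way to see why the general observation holds is the short derivation below. It is a sketch under the assumptions that the collection of objects $\{O\}$ is finite and that $Q$ is any competing distribution satisfying the same constraint $\mathbb{E}_Q[-\ln P] = S(P)$; the symbols $Q$ and $D_{\mathrm{KL}}$ (the Kullback-Leibler divergence) are introduced here for illustration, and the argument is not necessarily the one used to prove Lemma 1.

```latex
% The constraint E_Q[-ln P] = S(P) is finite, so Q(O) > 0 implies P(O) > 0
% and every logarithm below is well defined (with the convention 0 ln 0 = 0).
\begin{align*}
S(Q) &= -\sum_{O} Q(O) \ln Q(O) \\
     &= -\sum_{O} Q(O) \ln P(O) \;-\; \sum_{O} Q(O) \ln\frac{Q(O)}{P(O)} \\
     &= \mathbb{E}_Q[-\ln P] \;-\; D_{\mathrm{KL}}(Q \,\|\, P) \\
     &= S(P) \;-\; D_{\mathrm{KL}}(Q \,\|\, P) \;\le\; S(P).
\end{align*}
```

Since $D_{\mathrm{KL}}(Q \,\|\, P) \ge 0$ with equality only for $Q = P$ (Gibbs' inequality), no distribution satisfying the constraint has larger entropy than $P$ itself.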

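The multi-parameter model $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$ is the ERSC ensemble with two types of constrained observables: $\{a_{\mathbf{i}_d}\}$ and $\{b_{\mathbf{i}_d}\}$. The observables of the first type are simplices themselves: $a_{\mathbf{i}_d}(C) = 1$ if the $(d-1)$-simplex $\{\mathbf{i}_d\}$ belongs to $C$, and zero otherwise. The observables of the second type are their boundaries: $b_{\mathbf{i}_d}(C) = 1$ if the entire $(d-2)$-dimensional boundary of simplex $\{\mathbf{i}_d\}$ belongs to $C$, and zero otherwise. Theorem 3 states that $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$ is a solution of the following optimization problem:

$$
S(P) \to \max, \tag{8.1}
$$

$$
\sum_{C \in \mathcal{C}_{\le n}} P(C) = 1, \qquad \mathbb{E}_P[a_{\mathbf{i}_d}] = \bar{a}_{\mathbf{i}_d},\ d = 1, \dots, n, \qquad \mathbb{E}_P[b_{\mathbf{i}_d}] = \bar{b}_{\mathbf{i}_d},\ d = 2, \dots, n. \tag{8.2}
$$

If we drop the observables of the second type in this optimization problem, we alter the maximum-entropy distribution as illustrated in Figure 5. Since the distribution has changed, the ensemble $\tilde{\boldsymbol{\Delta}}(n; \mathbf{p}_1, \dots, \mathbf{p}_n) = \mathrm{ERSC}\left(\mathcal{C}_{\le n}, \{a_{\mathbf{i}_d}\}_{d=1}^{n}, \{\bar{a}_{\mathbf{i}_d}\}_{d=1}^{n}\right)$ defined by this distribution is now also different from $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$.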

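FIGURE 5. Constrained entropy maximization. The surface represents the Gibbs entropy $S$, which, in this schematic example, is a function on the set of all probability distributions on $\mathcal{C}_{\le n}$. The global maximum corresponds to the uniform distribution $U$, which is the maximum-entropy distribution among all distributions supported on $\mathcal{C}_{\le n}$. Theorem 3 shows that if we have two sets of constraints, $\mathbb{E}_P[a_{\mathbf{i}_d}] = \bar{a}_{\mathbf{i}_d}$ and $\mathbb{E}_P[b_{\mathbf{i}_d}] = \bar{b}_{\mathbf{i}_d}$, then the resulting maximum-entropy distribution is $P_{\boldsymbol{\Delta}}$. If we drop the second set of constraints, then we get some other maximum-entropy distribution $P_{\tilde{\boldsymbol{\Delta}}} \ne P_{\boldsymbol{\Delta}}$ for ensemble $\tilde{\boldsymbol{\Delta}} \ne \boldsymbol{\Delta}$.

The fact that the boundary-presence observables of the second type are also constrained in $\boldsymbol{\Delta}(n; \mathbf{p}_1, \dots, \mathbf{p}_n)$ may appear quite unexpected at first glance. The reason for the presence of these constraints is that simplex existence probabilities are actually conditional, where the conditions are the presence of simplex boundaries. If we go from conditional to unconditional probabilities, we change $\boldsymbol{\Delta}$ to $\tilde{\boldsymbol{\Delta}}$. Indeed, in $\boldsymbol{\Delta}$, $p_{\mathbf{i}_d}$ is the conditional probability of the $(d-1)$-simplex $\{\mathbf{i}_d\}$ to appear in $C$, given that its $(d-2)$-dimensional boundary is already in place,

$$
p_{\mathbf{i}_d} = \mathbb{P}_{\boldsymbol{\Delta}}(a_{\mathbf{i}_d} = 1 \mid b_{\mathbf{i}_d} = 1) = \frac{\mathbb{P}_{\boldsymbol{\Delta}}(a_{\mathbf{i}_d} = 1, b_{\mathbf{i}_d} = 1)}{\mathbb{P}_{\boldsymbol{\Delta}}(b_{\mathbf{i}_d} = 1)} = \frac{\mathbb{P}_{\boldsymbol{\Delta}}(a_{\mathbf{i}_d} = 1)}{\mathbb{P}_{\boldsymbol{\Delta}}(b_{\mathbf{i}_d} = 1)}, \tag{8.3}
$$

where the last equation follows from the compatibility condition $a_{\mathbf{i}_d} = 1 \Rightarrow b_{\mathbf{i}_d} = 1$. This means that the unconditional probability of having $\{\mathbf{i}_d\} \in C$ is

$$
\mathbb{P}_{\boldsymbol{\Delta}}(a_{\mathbf{i}_d} = 1) = p_{\mathbf{i}_d}\, \mathbb{P}_{\boldsymbol{\Delta}}(b_{\mathbf{i}_d} = 1), \tag{8.4}
$$

and, therefore, the expected values of observables $a_{\mathbf{i}_d}$ and $b_{\mathbf{i}_d}$ satisfy

$$
\bar{a}_{\mathbf{i}_d} = \mathbb{E}_{\boldsymbol{\Delta}}[a_{\mathbf{i}_d}] = \mathbb{P}_{\boldsymbol{\Delta}}(a_{\mathbf{i}_d} = 1) = p_{\mathbf{i}_d}\, \mathbb{P}_{\boldsymbol{\Delta}}(b_{\mathbf{i}_d} = 1) = p_{\mathbf{i}_d}\, \bar{b}_{\mathbf{i}_d}. \tag{8.5}
$$

Thus, if we want to represent $\boldsymbol{\Delta}$ as an ERSC and we fix the expected values $\bar{a}_{\mathbf{i}_d}$ of the observables of the first type, we must also fix the expected values $\bar{b}_{\mathbf{i}_d}$ of the observables of the second type. Moreover, these expected values are not independent and must satisfy $\bar{a}_{\mathbf{i}_d} = p_{\mathbf{i}_d} \bar{b}_{\mathbf{i}_d}$, which is consistent with Proposition 4. In Appendix A.4 we consider a special case with $n = 3$, $p_{i_1} = 1$, $p_{\mathbf{i}_2} = p_1$, and $p_{\mathbf{i}_3} = p_2$, and explicitly show that the maximum-entropy distributions with and without the second type constraints are different.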

To conclude, $\tilde{\boldsymbol{\Delta}} \ne \boldsymbol{\Delta}$. From the maximum-entropy point of view, the ensemble $\tilde{\boldsymbol{\Delta}}$, with only the observables of the first type constrained, appears more natural than $\boldsymbol{\Delta}$. Yet $\boldsymbol{\Delta}$ is more natural than $\tilde{\boldsymbol{\Delta}}$ in terms of the simplicity of its constructive Definition 4, which allows for efficient sampling of simplicial complexes. We leave open the questions of whether there exist ways to calculate the probability distribution $P_{\tilde{\boldsymbol{\Delta}}}(C)$ in ensemble $\tilde{\boldsymbol{\Delta}}$, and to efficiently sample from it, i.e., to easily generate simplicial complexes $C$ with this probability.

Appendix


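A.1. Proof of Proposition 3. Let us first compute the expected number of $d$-simplices in $\hat{C}^{(d)}$, where $C \sim \Delta(n; p_1, \dots, p_{n-1})$.

$$
\begin{aligned}
\bar{\phi}_d = \mathbb{E}[\phi_d] &= \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} b_{\mathbf{i}_{d+1}}\right] = \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{k=1}^{d+1} a_{\mathbf{i}_{d+1}^{k}}\right] \\
&= \mathbb{E}\left[\mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{k=1}^{d+1} a_{\mathbf{i}_{d+1}^{k}} \,\Big|\, \mathbf{a}_{d-1}\right]\right] = \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{k=1}^{d+1} \mathbb{E}\left[a_{\mathbf{i}_{d+1}^{k}} \,\big|\, \mathbf{a}_{d-1}\right]\right].
\end{aligned}
\tag{A.6}
$$

If the boundary of the $(d-1)$-simplex $\{\mathbf{i}_{d+1}^{k}\}$ belongs to $C$, i.e. $b_{\mathbf{i}_{d+1}^{k}} = 1$, then $\{\mathbf{i}_{d+1}^{k}\} \in C$, i.e. $a_{\mathbf{i}_{d+1}^{k}} = 1$, with probability $p_{d-1}$. Otherwise, if $b_{\mathbf{i}_{d+1}^{k}} = 0$, then automatically $a_{\mathbf{i}_{d+1}^{k}} = 0$. Therefore, the inner expected value is

$$
\mathbb{E}\left[a_{\mathbf{i}_{d+1}^{k}} \,\big|\, \mathbf{a}_{d-1}\right] = p_{d-1}\, b_{\mathbf{i}_{d+1}^{k}}. \tag{A.7}
$$

So,

$$
\bar{\phi}_d = \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{k=1}^{d+1} p_{d-1}\, b_{\mathbf{i}_{d+1}^{k}}\right] = p_{d-1}^{d+1}\, \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{\mathbf{k}_2} a_{\mathbf{i}_{d+1}^{\mathbf{k}_2}}\right], \tag{A.8}
$$

where $\mathbf{k}_2 = k_1, k_2$ is a pair of indices $1 \le k_1 < k_2 \le d+1$, and $\mathbf{i}_{d+1}^{\mathbf{k}_2} = i_1, \dots, \hat{i}_{k_1}, \dots, \hat{i}_{k_2}, \dots, i_{d+1}$ is the $(d-1)$-long multi-index with $i_{k_1}$ and $i_{k_2}$ omitted. Proceeding in this manner, we have:

$$
\begin{aligned}
\bar{\phi}_d &= p_{d-1}^{\binom{d+1}{1}}\, \mathbb{E}\left[\mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{\mathbf{k}_2} a_{\mathbf{i}_{d+1}^{\mathbf{k}_2}} \,\Big|\, \mathbf{a}_{d-2}\right]\right] = p_{d-1}^{\binom{d+1}{1}}\, \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{\mathbf{k}_2} \mathbb{E}\left[a_{\mathbf{i}_{d+1}^{\mathbf{k}_2}} \,\big|\, \mathbf{a}_{d-2}\right]\right] \\
&= p_{d-1}^{\binom{d+1}{1}}\, p_{d-2}^{\binom{d+1}{2}}\, \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{\mathbf{k}_3} a_{\mathbf{i}_{d+1}^{\mathbf{k}_3}}\right] = \dots \\
&= \prod_{k=1}^{d-1} p_k^{\binom{d+1}{d-k}}\, \mathbb{E}\left[\sum_{\mathbf{i}_{d+1}} \prod_{\mathbf{k}_d} a_{\mathbf{i}_{d+1}^{\mathbf{k}_d}}\right] = \binom{n}{d+1} \prod_{k=1}^{d-1} p_k^{\binom{d+1}{d-k}}.
\end{aligned}
\tag{A.9}
$$

The last equation holds because $a_{\mathbf{i}_{d+1}^{\mathbf{k}_d}} = 1$ for any $\mathbf{i}_{d+1}$ and $\mathbf{k}_d$, since all simplicial complexes $C \sim \Delta(n; p_1, \dots, p_{n-1})$ have exactly $n$ vertices. The expected number of $d$-simplices in $C^{(d)}$ is now:

$$
\bar{f}_d = \mathbb{E}[f_d] = \mathbb{E}\left[\mathbb{E}[f_d \mid \phi_d]\right] = \mathbb{E}[p_d \phi_d] = p_d \bar{\phi}_d = \binom{n}{d+1} \prod_{k=1}^{d} p_k^{\binom{d+1}{d-k}}. \tag{A.10}
$$

A.2. Proof of Proposition 4. Computations are similar to those in the previous section.

$$
\begin{aligned}
\bar{b}_{\mathbf{i}_d} &= \mathbb{E}\left[\prod_{k=1}^{d} a_{\mathbf{i}_d^{k}}\right] = \mathbb{E}\left[\mathbb{E}\left[\prod_{k=1}^{d} a_{\mathbf{i}_d^{k}} \,\Big|\, \mathbf{a}_{d-2}\right]\right] = \mathbb{E}\left[\prod_{k=1}^{d} \mathbb{E}\left[a_{\mathbf{i}_d^{k}} \,\big|\, \mathbf{a}_{d-2}\right]\right] \\
&= \mathbb{E}\left[\prod_{k=1}^{d} p_{\mathbf{i}_d^{k}}\, b_{\mathbf{i}_d^{k}}\right] = \prod_{k=1}^{d} p_{\mathbf{i}_d^{k}}\, \mathbb{E}\left[\prod_{\mathbf{k}_2} a_{\mathbf{i}_d^{\mathbf{k}_2}}\right] = \dots \\
&= \prod_{k=1}^{d} p_{\mathbf{i}_d^{k}} \prod_{\mathbf{k}_2} p_{\mathbf{i}_d^{\mathbf{k}_2}} \cdots \prod_{\mathbf{k}_{d-1}} p_{\mathbf{i}_d^{\mathbf{k}_{d-1}}}\, \mathbb{E}\left[\prod_{\mathbf{k}_{d-1}} b_{\mathbf{i}_d^{\mathbf{k}_{d-1}}}\right] = \prod_{m=1}^{d-1} \prod_{\mathbf{k}_m} p_{\mathbf{i}_d^{\mathbf{k}_m}},
\end{aligned}
\tag{A.11}
$$

since $b_{\mathbf{i}_d^{\mathbf{k}_{d-1}}} = 1$ for any $\mathbf{i}_d$ and $\mathbf{k}_{d-1}$. Finally,

$$
\bar{a}_{\mathbf{i}_d} = \mathbb{E}[a_{\mathbf{i}_d}] = \mathbb{E}\left[\mathbb{E}[a_{\mathbf{i}_d} \mid b_{\mathbf{i}_d}]\right] = \mathbb{E}[p_{\mathbf{i}_d} b_{\mathbf{i}_d}] = p_{\mathbf{i}_d} \bar{b}_{\mathbf{i}_d} = \prod_{m=0}^{d-1} \prod_{\mathbf{k}_m} p_{\mathbf{i}_d^{\mathbf{k}_m}}. \tag{A.12}
$$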


A.3. Special case: $\Delta(3; p_1, p_2)$. Theorem 2 in Section 6 explicitly represents Kahle’s multi-parameter model of random simplicial complexes $\Delta(n; p_1, \dots, p_{n-1})$ as an ERSC for any values of the parameters. This theorem is a direct corollary of Lemmas 1 and 2 in Section 5, which assert that any distribution is, in fact, the maximum-entropy distribution under certain constraints. Here we illustrate the difficulties that arise when one tries to compute the maximum-entropy distribution $P_\Delta$ using Theorem 1. We successfully used this method, which is based on computing the partition function, in Section 4 for the Erdős-Rényi random graphs and the random flag and Linial-Meshulam complexes. For Kahle’s $\Delta$-ensemble, however, the partition function becomes intractable.

Consider a special case of Kahle’s model with $n = 3$. According to Theorem 2 and Proposition 3, $\Delta(3; p_1, p_2)$ is the maximum-entropy ensemble of simplicial complexes on 3 vertices with three constraints:

$$
\mathbb{E}[f_1] = 3p_1, \qquad \mathbb{E}[f_2] = p_1^3 p_2, \qquad \mathbb{E}[\phi_2] = p_1^3. \tag{A.13}
$$

Let us compute the corresponding maximum-entropy distribution $P_{\Delta(3; p_1, p_2)}$ using Theorem 1. The partition function $Z$ in (3.4) is

$$
\begin{aligned}
Z(\theta_1, \theta_2, \theta_3) &= \sum_{C \in \mathcal{C}_3} e^{-H(C)} = \sum_{C \in \mathcal{C}_3} e^{-\theta_1 f_1(C) - \theta_2 f_2(C) - \theta_3 \phi_2(C)} \\
&= 1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1 - \theta_3} + e^{-3\theta_1 - \theta_2 - \theta_3},
\end{aligned}
\tag{A.14}
$$

where the last equality follows from Figure 6, where we list all complexes in $\mathcal{C}_3$ along with the corresponding values of observables $f_1$, $f_2$, and $\phi_2$. To find parameters $\theta_1$, $\theta_2$, and $\theta_3$, which are the Lagrange multipliers coupled to observables $f_1$, $f_2$, and $\phi_2$, we need to solve the system of three equations (3.5), where $\bar{x}_i$ are replaced by the expected values in (A.13):

$$
\begin{aligned}
\frac{3e^{-\theta_1} + 6e^{-2\theta_1} + 3e^{-3\theta_1} e^{-\theta_3} + 3e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= 3p_1, \\
\frac{e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= p_1^3 p_2, \\
\frac{e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= p_1^3.
\end{aligned}
\tag{A.15}
$$

FIGURE 6. Simplicial complexes on three vertices. Here we show all $C \in \mathcal{C}_3$ and the values of $f_1$ (number of 1-simplices), $f_2$ (number of 2-simplices), and $\phi_2$ (number of 2-simplices in $\hat{C}^{(2)}$) for each $C$.


After some tedious algebra, one can show that the solution is

$$
e^{-\theta_1} = \frac{p_1}{1 - p_1}, \qquad e^{-\theta_2} = \frac{p_2}{1 - p_2}, \qquad e^{-\theta_3} = 1 - p_2. \tag{A.16}
$$

The partition function simplifies then to

$$
Z = \frac{1}{(1 - p_1)^3}. \tag{A.17}
$$

Therefore, the maximum-entropy distribution is

$$
\begin{aligned}
P_{\Delta(3; p_1, p_2)}(C) &= \frac{e^{-H(C)}}{Z} = \frac{e^{-\theta_1 f_1(C) - \theta_2 f_2(C) - \theta_3 \phi_2(C)}}{Z} \\
&= (1 - p_1)^3 \left(\frac{p_1}{1 - p_1}\right)^{f_1(C)} \left(\frac{p_2}{1 - p_2}\right)^{f_2(C)} (1 - p_2)^{\phi_2(C)} \\
&= p_1^{f_1(C)} (1 - p_1)^{3 - f_1(C)}\, p_2^{f_2(C)} (1 - p_2)^{\phi_2(C) - f_2(C)}.
\end{aligned}
\tag{A.18}
$$

As expected, the obtained distribution coincides with the distribution in (6.5), where $n = 3$ and $\phi_1(C) = 3$. Unfortunately, this method of computing $P_\Delta$ cannot be extended to the general case $\Delta(n; p_1, \dots, p_{n-1})$: when $n > 3$ the partition function $Z$ and the corresponding analog of system (A.15) become analytically intractable. This makes Lemmas 1 and 2 an essential tool for proving Theorem 2 and the more general Theorem 3.
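The "tedious algebra" can at least be spot-checked numerically. The sketch below plugs the multipliers (A.16) into the partition function (A.14) and the left-hand sides of (A.15); the function name `check_a3` and the local variable names are illustrative only.

```python
from math import exp, isclose, log


def check_a3(p1, p2):
    """Evaluate (A.14) and the left-hand sides of (A.15) at the solution (A.16)."""
    t1 = log((1 - p1) / p1)  # theta_1, so that exp(-theta_1) = p1 / (1 - p1)
    t2 = log((1 - p2) / p2)  # theta_2, so that exp(-theta_2) = p2 / (1 - p2)
    t3 = log(1 / (1 - p2))   # theta_3, so that exp(-theta_3) = 1 - p2
    x, y, w = exp(-t1), exp(-t2), exp(-t3)
    z = 1 + 3 * x + 3 * x**2 + x**3 * w + x**3 * y * w
    e_f1 = (3 * x + 6 * x**2 + 3 * x**3 * w + 3 * x**3 * y * w) / z
    e_f2 = x**3 * y * w / z
    e_phi2 = (x**3 * w + x**3 * y * w) / z
    return z, e_f1, e_f2, e_phi2
```

For any $p_1, p_2 \in (0, 1)$ the returned values should agree, up to floating-point error, with $1/(1-p_1)^3$, $3p_1$, $p_1^3 p_2$, and $p_1^3$, i.e. with (A.17) and the right-hand sides of (A.13).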


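A.4. $\mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\})$. Here we derive the maximum-entropy distribution on $\mathcal{C}_3$ only under the constraints of the first type, $\mathbb{E}[f_1] = \bar{f}_1$ and $\mathbb{E}[f_2] = \bar{f}_2$, and show that it is different from $P_{\Delta(3; p_1, p_2)}$. This explicitly demonstrates that the constraint of the second type, $\mathbb{E}[\phi_2] = \bar{\phi}_2$, is not redundant, and, if dropped, the resulting maximum-entropy ensemble will no longer be $\Delta$.

Let $(\mathcal{C}_3, P)$ be the maximum-entropy ensemble $\mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\})$. In other words, $P$ is the maximum-entropy distribution on $\mathcal{C}_3$ under the constraints

$$
\mathbb{E}[f_1] = \bar{f}_1 \quad \text{and} \quad \mathbb{E}[f_2] = \bar{f}_2. \tag{A.19}
$$

We can find $P$ using Theorem 1 as in the previous section. The partition function is

$$
\begin{aligned}
Z(\theta_1, \theta_2) &= \sum_{C \in \mathcal{C}_3} e^{-H(C)} = \sum_{C \in \mathcal{C}_3} e^{-\theta_1 f_1(C) - \theta_2 f_2(C)} \\
&= \left(1 + e^{-\theta_1}\right)^3 + e^{-3\theta_1} e^{-\theta_2},
\end{aligned}
\tag{A.20}
$$

where the last equality is obtained with the help of Figure 6. The system of equations (3.5) for $\theta_1$ and $\theta_2$ is then

$$
\begin{aligned}
\frac{3e^{-\theta_1} \left(1 + e^{-\theta_1}\right)^2 + 3e^{-3\theta_1} e^{-\theta_2}}{\left(1 + e^{-\theta_1}\right)^3 + e^{-3\theta_1} e^{-\theta_2}} &= \bar{f}_1, \\
\frac{e^{-3\theta_1} e^{-\theta_2}}{\left(1 + e^{-\theta_1}\right)^3 + e^{-3\theta_1} e^{-\theta_2}} &= \bar{f}_2,
\end{aligned}
\tag{A.21}
$$

and one can check that the solution is given by

$$
e^{-\theta_1} = \frac{\frac{\bar{f}_1}{3} - \bar{f}_2}{1 - \frac{\bar{f}_1}{3}} \quad \text{and} \quad e^{-\theta_2} = \frac{\bar{f}_2 \left(1 - \bar{f}_2\right)^2}{\left(\frac{\bar{f}_1}{3} - \bar{f}_2\right)^3}. \tag{A.22}
$$

The partition function, as a function of $\bar{f}_1$ and $\bar{f}_2$, is then

$$
Z = \frac{\left(1 - \bar{f}_2\right)^2}{\left(1 - \frac{\bar{f}_1}{3}\right)^3}. \tag{A.23}
$$

Therefore, the maximum-entropy distribution is

$$
\begin{aligned}
P(C) &= \frac{e^{-H(C)}}{Z} = \frac{e^{-\theta_1 f_1(C) - \theta_2 f_2(C)}}{Z} \\
&= \frac{\left(1 - \frac{\bar{f}_1}{3}\right)^3}{\left(1 - \bar{f}_2\right)^2} \left(\frac{\frac{\bar{f}_1}{3} - \bar{f}_2}{1 - \frac{\bar{f}_1}{3}}\right)^{f_1(C)} \left(\frac{\bar{f}_2 \left(1 - \bar{f}_2\right)^2}{\left(\frac{\bar{f}_1}{3} - \bar{f}_2\right)^3}\right)^{f_2(C)} \\
&= \left(\frac{\bar{f}_1}{3} - \bar{f}_2\right)^{f_1(C) - 3f_2(C)} \left(1 - \frac{\bar{f}_1}{3}\right)^{3 - f_1(C)} \bar{f}_2^{\,f_2(C)} \left(1 - \bar{f}_2\right)^{2f_2(C) - 2}.
\end{aligned}
\tag{A.24}
$$

This is a general expression for $P$ for any expected values $\bar{f}_1$ and $\bar{f}_2$. In the special case when $\bar{f}_1$ and $\bar{f}_2$ coincide with the corresponding values for $\Delta(3; p_1, p_2)$ in (A.13), that is, $\bar{f}_1 = 3p_1$ and $\bar{f}_2 = p_1^3 p_2$, the distribution $P$ reduces to

$$
P(C) = p_1^{f_1(C)} (1 - p_1)^{3 - f_1(C)}\, p_2^{f_2(C)} \left(1 - p_1^2 p_2\right)^{f_1(C) - 3f_2(C)} \left(1 - p_1^3 p_2\right)^{2f_2(C) - 2}. \tag{A.25}
$$

We see that $P \ne P_{\Delta(3; p_1, p_2)}$. This means that the two maximum-entropy ensembles $\Delta(3; p_1, p_2)$ and $(\mathcal{C}_3, P)$ are different,

$$
\mathrm{ERSC}\left(\mathcal{C}_3, \{f_1, f_2, \phi_2\}, \{\bar{f}_1, \bar{f}_2, \bar{\phi}_2\}\right) \ne \mathrm{ERSC}\left(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\}\right), \tag{A.26}
$$

and, more generally,

$$
\mathrm{ERSC}\left(\mathcal{C}_n, \left\{\{f_d\}_{d=1}^{n-1}, \{\phi_d\}_{d=2}^{n-1}\right\}, \left\{\{\bar{f}_d\}_{d=1}^{n-1}, \{\bar{\phi}_d\}_{d=2}^{n-1}\right\}\right) \ne \mathrm{ERSC}\left(\mathcal{C}_n, \{f_d\}_{d=1}^{n-1}, \{\bar{f}_d\}_{d=1}^{n-1}\right). \tag{A.27}
$$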
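The difference between the two ensembles on three vertices can be made concrete with a few lines of Python. This is an illustrative sketch: the function names and the table `CLASSES`, which groups the nine complexes of Figure 6 by their values of $(f_1, f_2, \phi_2)$ together with the number of complexes in each group, are assumptions of this example rather than objects defined in the paper.

```python
from math import isclose


def p_first_type_only(f1, f2, fbar1, fbar2):
    """Distribution (A.24): only E[f1] and E[f2] are constrained."""
    return (
        (fbar1 / 3 - fbar2) ** (f1 - 3 * f2)
        * (1 - fbar1 / 3) ** (3 - f1)
        * fbar2**f2
        * (1 - fbar2) ** (2 * f2 - 2)
    )


def p_kahle(f1, f2, phi2, p1, p2):
    """Distribution (A.18) of the Kahle model on three vertices."""
    return p1**f1 * (1 - p1) ** (3 - f1) * p2**f2 * (1 - p2) ** (phi2 - f2)


# (f1, f2, phi2, number of complexes with these values) for the 9 complexes.
CLASSES = [(0, 0, 0, 1), (1, 0, 0, 3), (2, 0, 0, 3), (3, 0, 1, 1), (3, 1, 1, 1)]
```

With $p_1 = p_2 = 1/2$, so that $\bar{f}_1 = 3/2$ and $\bar{f}_2 = 1/16$, both distributions are normalized and reproduce the same $\mathbb{E}[f_1]$ and $\mathbb{E}[f_2]$, yet they assign different probabilities to, for example, the empty triangle $(f_1, f_2) = (3, 0)$.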
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